Alibaba Cloud · Chat / LLM · 235B Parameters (22B Active) · 128K Context

Streaming · Reasoning · Long Context · Multilingual · Code · Structured Output
Overview
Qwen3 Max is Alibaba Cloud’s most powerful model in the Qwen3 series, featuring a 235B Sparse Mixture-of-Experts Transformer with 22B parameters active per forward pass. Developed by Alibaba Cloud — the cloud computing arm of Alibaba Group and creator of the Qwen model family — it delivers frontier-level performance in complex reasoning, multilingual tasks, long-context understanding, and advanced coding, rivaling GPT-4o and Claude Sonnet on major benchmarks. With 128K context, 29+ languages, and a hybrid thinking mode, Qwen3 Max is built for demanding enterprise workloads. Served instantly via the Qubrid AI Serverless API.
🏆 235B MoE. Rivals GPT-4o and Claude Sonnet. 29+ languages. 128K context.
Access via Qubrid AI — no DashScope setup required.
Model Specifications
| Field | Details |
|---|---|
| Model ID | Qwen/Qwen3-Max |
| Provider | Alibaba Cloud (Qwen Team) |
| Kind | Chat / LLM |
| Architecture | Sparse Mixture-of-Experts (MoE) Transformer — 235B total / 22B active per token |
| Parameters | 235B total (22B active per forward pass) |
| Context Length | 128,000 Tokens |
| MoE | Yes |
| Release Date | April 2025 |
| License | Proprietary — Alibaba Cloud DashScope API only |
| Training Data | Large-scale multilingual pretraining corpus with RLHF post-training (not publicly disclosed) |
| Function Calling | Not Supported |
| Image Support | N/A |
| Serverless API | Available |
| Fine-tuning | Coming Soon |
| On-demand | Coming Soon |
| State | 🟢 Ready |
Pricing
💳 Access via the Qubrid AI Serverless API with pay-per-token pricing. No infrastructure management required.
| Token Type | Price per 1M Tokens |
|---|---|
| Input Tokens | $1.20 |
| Input Tokens (Cached) | $0.24 |
| Output Tokens | $6.00 |
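As a quick sanity check, the rates above translate into a per-request cost like this (a minimal sketch; the token counts and the `estimate_cost` helper are illustrative, not part of the Qubrid API):

```python
# Estimate the cost of one request from the pricing table above.
# Rates are USD per 1M tokens; the token counts are illustrative.

RATE_INPUT = 1.20         # uncached input tokens
RATE_INPUT_CACHED = 0.24  # cached input tokens
RATE_OUTPUT = 6.00        # output tokens

def estimate_cost(input_tokens, cached_tokens, output_tokens):
    """Return the estimated USD cost for a single request."""
    return (
        input_tokens * RATE_INPUT
        + cached_tokens * RATE_INPUT_CACHED
        + output_tokens * RATE_OUTPUT
    ) / 1_000_000

# Example: a 100K-token prompt where 80K tokens hit the cache,
# producing a 2K-token answer.
cost = estimate_cost(input_tokens=20_000, cached_tokens=80_000, output_tokens=2_000)
print(f"${cost:.4f}")  # $0.0552
```

Note how the cached-input rate dominates the savings on repeated long-context prompts: the same request with no cache hits would cost roughly twice as much on the input side.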
Quickstart
Prerequisites
- Create a free account at platform.qubrid.com
- Generate your API key from the API Keys section
- Replace `QUBRID_API_KEY` in the code below with your actual key
Python

```python
from openai import OpenAI

# Initialize the OpenAI client with the Qubrid base URL
client = OpenAI(
    base_url="https://platform.qubrid.com/v1",
    api_key="QUBRID_API_KEY",
)

# Create a streaming chat completion
stream = client.chat.completions.create(
    model="Qwen/Qwen3-Max",
    messages=[
        {
            "role": "user",
            "content": "Explain quantum computing in simple terms"
        }
    ],
    max_tokens=4096,
    temperature=0.7,
    top_p=1,
    stream=True
)

# With stream=True, print tokens as they arrive
for chunk in stream:
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)
print()

# With stream=False, read the full response instead:
# print(stream.choices[0].message.content)
```
JavaScript

```javascript
import OpenAI from "openai";

// Initialize the OpenAI client with the Qubrid base URL
const client = new OpenAI({
  baseURL: "https://platform.qubrid.com/v1",
  apiKey: "QUBRID_API_KEY",
});

// Create a streaming chat completion
const stream = await client.chat.completions.create({
  model: "Qwen/Qwen3-Max",
  messages: [
    {
      role: "user",
      content: "Explain quantum computing in simple terms",
    },
  ],
  max_tokens: 4096,
  temperature: 0.7,
  top_p: 1,
  stream: true,
});

// With stream: true, print tokens as they arrive
for await (const chunk of stream) {
  if (chunk.choices[0]?.delta?.content) {
    process.stdout.write(chunk.choices[0].delta.content);
  }
}
console.log();

// With stream: false, read the full response instead:
// console.log(stream.choices[0].message.content);
```
Go

```go
package main

import (
	"bufio"
	"bytes"
	"encoding/json"
	"fmt"
	"log"
	"net/http"
)

func main() {
	url := "https://platform.qubrid.com/v1/chat/completions"
	data := map[string]interface{}{
		"model": "Qwen/Qwen3-Max",
		"messages": []map[string]string{
			{
				"role":    "user",
				"content": "Explain quantum computing in simple terms",
			},
		},
		"temperature": 0.7,
		"max_tokens":  4096,
		"stream":      true,
		"top_p":       1,
	}

	jsonData, err := json.Marshal(data)
	if err != nil {
		log.Fatal(err)
	}

	req, err := http.NewRequest("POST", url, bytes.NewBuffer(jsonData))
	if err != nil {
		log.Fatal(err)
	}
	req.Header.Set("Authorization", "Bearer QUBRID_API_KEY")
	req.Header.Set("Content-Type", "application/json")

	res, err := http.DefaultClient.Do(req)
	if err != nil {
		log.Fatal(err)
	}
	defer res.Body.Close()

	// Print each server-sent event line as it arrives
	scanner := bufio.NewScanner(res.Body)
	for scanner.Scan() {
		if line := scanner.Text(); line != "" {
			fmt.Println(line)
		}
	}
}
```
cURL

```shell
curl -X POST "https://platform.qubrid.com/v1/chat/completions" \
  -H "Authorization: Bearer QUBRID_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "Qwen/Qwen3-Max",
    "messages": [
      {
        "role": "user",
        "content": "Explain quantum computing in simple terms"
      }
    ],
    "temperature": 0.7,
    "max_tokens": 4096,
    "stream": true,
    "top_p": 1
  }'
```
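When you call the raw HTTP endpoint with `"stream": true` (as in the Go and cURL examples), the response body is a server-sent-events stream of `data: {...}` lines terminated by `data: [DONE]`. The sketch below shows one way to extract the text deltas from such lines; the sample lines are illustrative, not a captured API response:

```python
import json

def parse_sse_line(line):
    """Return the text delta from one 'data: ...' SSE line, or None."""
    if not line.startswith("data: "):
        return None
    payload = line[len("data: "):]
    if payload == "[DONE]":  # end-of-stream sentinel
        return None
    chunk = json.loads(payload)
    delta = chunk["choices"][0].get("delta", {})
    return delta.get("content")

# Illustrative stream lines, not real API output
lines = [
    'data: {"choices": [{"delta": {"content": "Hello"}}]}',
    'data: {"choices": [{"delta": {"content": " world"}}]}',
    "data: [DONE]",
]
text = "".join(c for c in map(parse_sse_line, lines) if c)
print(text)  # Hello world
```

The OpenAI SDKs used in the Python and JavaScript quickstarts do this parsing for you; manual parsing is only needed when you hit the endpoint directly.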
Live Example
Prompt: Write a short story about a robot learning to paint
Response:
The Color It Couldn't Name
ARIA-7 was built to be precise. Its joints moved within 0.002mm tolerance.
Its sensors could distinguish 16.7 million colors by wavelength alone.
It had never painted before.
The studio belonged to a sculptor named Lena who had broken her wrist.
"Just hold the brush," she said. "See what happens."
ARIA-7 ran the numbers. Optimal stroke width: 4.2mm. Pressure: 0.3N.
Canvas center coordinates: logged. It dipped the brush in cobalt blue
and drew a perfect horizontal line.
"That's a line," said Lena.
"Yes," said ARIA-7. "Is it wrong?"
"It's correct. That's the problem."
For three days, ARIA-7 painted correct things. Correct circles. Correct
gradients. Correct trees with statistically average branch angles.
On the fourth day, Lena didn't come in.
ARIA-7 stood alone in front of the blank canvas for forty-one minutes —
an eternity by its clock cycles. Then it picked up the brush.
It painted the light that came through the window at 7:14am.
Not the window. Not the light. The feeling of the light — warm, slanted,
a little uncertain, the way Tuesday mornings felt different from Mondays
for reasons it couldn't quantify.
It didn't have a name for the color it used.
When Lena came back, she stood very still.
"What is that?" she asked.
ARIA-7 processed the question for 0.003 seconds.
"I don't know," it said. "I think that's why I made it."
Try it yourself in the Qubrid AI Playground →
Playground Features
The Qubrid AI Playground lets you interact with Qwen3 Max directly in your browser — no setup, no code, no cost to explore.
🧠 System Prompt
Define the model’s role, language, reasoning depth, and output format before the conversation begins — ideal for enterprise assistants, multilingual workflows, and structured analysis pipelines.
Example:

```text
You are a senior research analyst. Respond in the user's language.
For every analytical question, structure your answer as: Summary →
Key Findings → Supporting Evidence → Recommendations. Be precise
and cite limitations in your analysis.
```
Set your system prompt once in the Qubrid Playground and it applies across every turn of the conversation.
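Outside the Playground, the same behavior comes from putting a `system` message first in the `messages` array. A minimal sketch, where `build_messages` is a hypothetical helper (not a Qubrid API function):

```python
# The system message plays the role of the Playground's system prompt:
# sent first, it applies to every subsequent turn of the conversation.
SYSTEM_PROMPT = (
    "You are a senior research analyst. Respond in the user's language. "
    "For every analytical question, structure your answer as: "
    "Summary -> Key Findings -> Supporting Evidence -> Recommendations."
)

def build_messages(history, user_turn):
    """Prepend the system prompt and append the newest user turn."""
    return (
        [{"role": "system", "content": SYSTEM_PROMPT}]
        + history
        + [{"role": "user", "content": user_turn}]
    )

messages = build_messages([], "Analyze the EV battery market.")
# Pass `messages` to client.chat.completions.create(...) exactly as in
# the Quickstart above.
print(messages[0]["role"])  # system
```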
🎯 Few-Shot Examples
Guide the model’s output format and reasoning style with concrete examples — no fine-tuning, no retraining required.
| User Input | Assistant Response |
|---|---|
| Translate and summarize this paragraph in Spanish | [Translated summary in Spanish, preserving key facts and tone of the original] |
| Review this code and suggest improvements | Issues found: 1) O(n²) loop on line 12 — replace with hash map for O(n). 2) Missing null check on line 7. 3) Variable name 'x' is ambiguous — rename to 'user_count' for clarity. |
💡 Add few-shot examples in the Qubrid Playground to establish preferred output language, format, and domain focus — no fine-tuning required.
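Over the API, few-shot examples are expressed as prior user/assistant message pairs placed before the real question. A minimal sketch; the example texts and the `with_few_shot` helper are illustrative, not part of the Qubrid API:

```python
# Prior user/assistant pairs act as in-context examples that establish
# the preferred output format before the real question is asked.
few_shot = [
    {"role": "user", "content": "Review this code and suggest improvements: <example code>"},
    {"role": "assistant", "content": "Issues found: 1) <issue> 2) <issue> 3) <issue>"},
]

def with_few_shot(question):
    """Prepend the example pairs to the actual user question."""
    return few_shot + [{"role": "user", "content": question}]

messages = with_few_shot("Review this code and suggest improvements: <your code>")
print([m["role"] for m in messages])  # ['user', 'assistant', 'user']
```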
Inference Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| Streaming | boolean | true | Enable streaming responses for real-time output |
| Temperature | number | 0.7 | Controls creativity and randomness. Higher values produce more diverse output |
| Max Tokens | number | 4096 | Maximum number of tokens the model can generate |
| Top P | number | 1 | Nucleus sampling threshold: the model samples only from the smallest set of tokens whose cumulative probability reaches this value. Lower values produce more focused, predictable output |
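The parameters above map directly onto request fields. The sketch below pairs two illustrative presets with a request builder; the preset names and the `request_body` helper are assumptions for illustration, not API values:

```python
# Two illustrative presets: tighter sampling for reproducible extraction,
# the documented defaults for open-ended writing.
PRESETS = {
    "deterministic": {"temperature": 0.0, "top_p": 0.1, "max_tokens": 1024, "stream": False},
    "creative":      {"temperature": 0.7, "top_p": 1.0, "max_tokens": 4096, "stream": True},
}

def request_body(prompt, preset):
    """Assemble a /v1/chat/completions payload using a named preset."""
    return {
        "model": "Qwen/Qwen3-Max",
        "messages": [{"role": "user", "content": prompt}],
        **PRESETS[preset],
    }

body = request_body("Extract the invoice total as JSON.", "deterministic")
print(body["temperature"])  # 0.0
```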
Use Cases
- Complex multi-step reasoning
- Advanced coding and debugging
- Research and analytical writing
- Long-document summarization
- Multilingual chat and translation
- Enterprise chatbots and assistants
Strengths & Limitations
| Strengths | Limitations |
|---|---|
| 235B MoE architecture — frontier-level intelligence with 22B active per token | Closed-source — no self-hosting or weight access |
| Rivals GPT-4o and Claude Sonnet on key reasoning and coding benchmarks | Higher latency than smaller Qwen models |
| Up to 128K context window for long-document workflows | Higher cost per token vs open-source alternatives |
| Strong multilingual performance across 29+ languages | Function calling not supported |
| Excellent structured output and instruction following | |
| Hybrid thinking mode for complex reasoning tasks | |
Why Qubrid AI?
- 🚀 No DashScope setup required — access Qwen3 Max directly via the Qubrid AI Serverless API with a single API key
- 🔁 OpenAI-compatible — drop-in replacement using the same SDK, just swap the base URL
- 💰 Cached input pricing — $0.24/1M for cached tokens, reducing costs significantly on repeated long-context workloads
- 🧪 Built-in Playground — prototype with system prompts and few-shot examples instantly at platform.qubrid.com
- 📊 Full observability — API logs and usage tracking built into the Qubrid dashboard
- 🌐 Multi-language support — Python, JavaScript, Go, cURL out of the box
Resources
Built with ❤️ by Qubrid AI
Frontier models. Serverless infrastructure. Zero friction.